Abstract

The volatility of financial instruments is rarely constant, and usually varies over time. This creates a 
phenomenon called volatility clustering, where large price movements on one day are followed by similarly 
large movements on successive days, creating temporal clusters. The GARCH model, which treats volatility 
as a drift process, is commonly used to capture this behavior. However, research suggests that volatility is 
often better described by a structural break model, where the volatility undergoes abrupt jumps in addition 
to drift. Most efforts to integrate these jumps into the GARCH methodology have resulted in models which 
are either very computationally demanding, or which make problematic assumptions about the distribution 
of the instruments, often assuming that they are Gaussian. We present a new approach which uses ideas from 
nonparametric statistics to identify structural break points without making such distributional assumptions, 
and then models drift separately within each identified regime. Using our method, we investigate the 
volatility of several major stock indexes, and find that our approach can potentially give an improved fit 
compared to more commonly used techniques. 



1. Introduction 

Volatility clustering is often observed in the return series of financial instruments [531 I33|. This 
phenomenon is best illustrated by an example. Let $S_t$ denote the price of some financial instrument at a set 
of equally spaced discrete time points $t = 1, 2, \ldots$, and let the return series be the log-increments 
$r_t = \log S_t - \log S_{t-1}$. The volatility of the instrument is defined as the standard deviation of these returns. 
A typical example of a financial return series can be seen in Figure 1, which shows the daily returns of 
the Dow Jones stock index over a 20 year period ranging from January 1991 to August 2011. It can be 
observed that the standard deviation is not constant, but instead varies over time. In particular, note that 
the period from 2003 to 2007 seems to have noticeably lower volatility than the period immediately before 
or after. Similarly, in 2008 there are many extreme return values which occur in close succession, pointing 
to an abnormally high volatility during this period. 
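
As a concrete illustration of these definitions, the following minimal Python sketch computes the log-returns of a price series and a rolling estimate of their standard deviation. The function names, the window length, and the toy prices are assumptions made purely for illustration; they are not taken from the data used in this paper.

```python
import numpy as np

def log_returns(prices):
    """Return r_t = log S_t - log S_{t-1} for a sequence of prices."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices))

def rolling_volatility(returns, window):
    """Sample standard deviation of the returns over a sliding window."""
    returns = np.asarray(returns, dtype=float)
    return np.array([returns[i - window:i].std(ddof=1)
                     for i in range(window, len(returns) + 1)])

# Toy prices (hypothetical values, not real index data).
prices = [100.0, 101.0, 99.5, 102.0, 101.0]
r = log_returns(prices)
vol = rolling_volatility(r, window=3)
```

A rolling window of this kind only gives a rough visual impression of how the volatility changes over time; it is not the change point methodology developed in the later sections.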

Volatility clustering refers to this notion that large/small returns tend to be followed by similarly 
large/small values, which results in extended regimes of abnormally high or low volatility. This has been 
empirically observed in many different financial time series, and poses a problem for traditional financial 
models, which have typically assumed that the volatility is roughly constant over time. The last 25 years 
have seen an increasing number of attempts to model the time-varying nature of volatility, and the generalized 
autoregressive conditional heteroskedasticity (GARCH) model [2], along with its many variants, is now 
the de facto standard. The idea behind GARCH is that the volatility undergoes a stochastic drift process, 
where the conditional volatility at time t is a random variable, with a conditional distribution which depends 
on the long term volatility, the volatility during the most recent period, and the most recent values of the 
return series. 

However, the gradual drift process underlying the GARCH model seems to be empirically violated in 
many real financial series. In some cases, volatility seems to behave more like a jump process, where it 
fluctuates around some value for an extended period of time, before undergoing an abrupt change, after 
which it fluctuates around a new value. This can be seen in Figure 1 around the year 1996, where the 
volatility spontaneously increases for a period of several years, before dropping to a lower value during 
2003. Since the standard GARCH model does not allow for these sudden jumps, it tends to 
overestimate the degree of long term volatility persistence. This has prompted the development of 
regime-switching GARCH processes which can incorporate jumps (TTl [T3]. In these models, the return series is 
allowed to contain multiple change points which segment it into regimes, with the GARCH model having 
different parameters within each segment. 

Figure 1: Daily returns of the Dow Jones index between January 1984 and August 2011 

However, such models can be hard to estimate. Although there are computationally efficient procedures 
for estimating multiple change points in simpler ARCH models [HI HH1 HHj, the long-range dependence 
introduced by the GARCH formulation makes such approaches difficult to apply. Standard techniques for 
fitting multiple change point models to data assume independence between segments [8] which is not the 
case in the GARCH framework. Although there have been some recent attempts to fit such models 
[13], this remains a difficult numerical procedure. Therefore, the most popular strategy is to instead use the 
approximate procedure introduced by [T], where the model is fitted in stages, with the abrupt change points 
first being located using the iterated cumulative sum of squares (ICSS) algorithm [T3], before a GARCH 
model is then estimated conditional on these change points. This ICSS-GARCH algorithm has been used to 
study a wide variety of financial time series. For example, [5] uses it to study the volatility of the US dollar 
exchange rate against several different currencies, [3T] studies the returns of the Canadian stock exchange, 
[T7] does likewise for the Japanese and Korean exchanges, and [TH] analyses the market for crude oil. 

Although ICSS-GARCH is simple to implement and has been shown to give improved results compared 
to standard GARCH models, it is not without its problems [301 [37]. The parameters of the ICSS algorithm 
are usually designed under the assumption that the financial returns follow a Gaussian distribution, and it 
can produce many spurious jump points if this assumption is violated. We will show later that applying 
ICSS to heavy-tailed series can give poor results, since extreme observations are misinterpreted as being 
regime shifts. Unfortunately, it has now been conclusively established that financial data is very rarely 
Gaussian, and return series typically exhibit heavy tail behaviour [321 [301 [251 [2H HO]. Similar heavy tail 
behaviour has been observed in many financial series and is not limited to asset returns [26]. 

This limitation of the ICSS-GARCH methodology has meant that it is usually only used to detect 
change points in the weekly returns of financial instruments, i.e. where $S_t$ and $S_{t+1}$ are one week apart 
[5J [T71 [T|>1 HH [T]. Using the algorithm on daily returns can generate too many false positives for 
it to be useful, due to the number of extreme values. This is a problem since the daily returns are more 
fine-grained and hence using them should allow more accurate volatility modelling. Therefore, it is desirable 
to find a way to use this data when it is available. 

In this paper we present an alternative to the ICSS-GARCH algorithm which is better suited for dealing 
with the heavy tailed, non-Gaussian data which is typical in finance. We replace the ICSS segmentation 
step of ICSS-GARCH with an alternative technique based on nonparametric statistics, which does not 
make any assumptions about the true returns distribution. This entirely avoids the Gaussianity 
assumption and allows the method to be deployed on daily returns. Our approach is based on the nonparametric 
change point model framework described in [JSIEE], and we hence refer to it as NPCPM-GARCH. Using 
this technique, we analyse several stock indexes for volatility change points, specifically focusing on the Dow 
Jones Industrial Average, the German DAX, the VIX volatility index, and the Japanese Nikkei 225. We 
compare our results with those of ICSS-GARCH, and find that our method generally gives a better fit to 
the data when measured using standard criteria. This suggests that it could be a widely useful tool for 
modelling volatility in other contexts. 

The remainder of the paper proceeds as follows. We begin in Section 2 by describing the ICSS step of the 
ICSS-GARCH algorithm. We explain why it gives poor performance when used with heavy tailed data, and 
give a simulated example using Student-t data to show this. In Section 3 we introduce our new nonparametric 
approach. We then briefly review the GARCH stage of the algorithm in Section 3.3, and in Section 4 we 
present an empirical evaluation of our method on a range of stock index series. 



2. The ICSS-GARCH Algorithm 

The ICSS-GARCH methodology has two stages. Given a financial returns series, the ICSS algorithm is 
first used to detect any change points in the volatility, and the series is then segmented around these points. 
Then, a separate GARCH model is fitted to each segment. We begin by giving an overview of the ICSS stage, 
before pointing out its problems and introducing our alternative. Next, we review the GARCH estimation 
stage. 

2.1. Stage 1: Change Point Detection using ICSS 

The Iterated Cumulative Sum-of-Squares algorithm is based on the work of [14], who proposed a 
retrospective technique for detecting changes in the variance of a financial time series. Given a series of financial 
returns $r_1, r_2, \ldots, r_n$ with mean 0, define the cumulative sum of squares as $C_t = \sum_{i=1}^{t} r_i^2$, and let 

$$D_t = \frac{C_t}{C_n} - \frac{t}{n}, \quad t = 1, \ldots, n, \quad D_0 = D_n = 0.$$

If the sequence has constant variance, then the value of $D_t$ will oscillate around 0. However, if the 
variance undergoes an abrupt change at some point $\tau < n$ then the value of $D_t$ will exhibit extreme 
behaviour around this point, with its magnitude becoming unusually large. Change detection is carried out 
by defining a threshold $h_n$, and comparing the maximum value of $|D_t|$ to this. Specifically, a change is flagged 
if: 

$$\sqrt{n/2}\,\max_t |D_t| > h_n, \tag{1}$$

where the $\sqrt{n/2}$ factor is included for standardization purposes. If the threshold is exceeded then the 
estimate of the change point, which we denote $\hat{\tau}$, is located at the value of $t$ which gave the maximum value 
of $|D_t|$, i.e. $\hat{\tau} = \arg\max_t |D_t|$. 
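
The single change point test above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `icss_statistic`, the simulated Gaussian series, and the random seed are assumptions made for the example, and the threshold 1.358 is the asymptotic Gaussian value discussed later in this section.

```python
import numpy as np

def icss_statistic(returns):
    """Return (sqrt(n/2) * max_t |D_t|, arg max t) for a zero-mean series."""
    r = np.asarray(returns, dtype=float)
    n = len(r)
    c = np.cumsum(r ** 2)                        # C_t for t = 1..n
    d = c / c[-1] - np.arange(1, n + 1) / n      # D_t, so that D_n = 0
    t_hat = int(np.argmax(np.abs(d))) + 1        # 1-based location of the maximum
    return np.sqrt(n / 2.0) * np.abs(d[t_hat - 1]), t_hat

# Hypothetical example: the standard deviation jumps from 1 to 3 after t = 300.
rng = np.random.default_rng(0)
series = np.concatenate([rng.normal(0, 1, 300), rng.normal(0, 3, 300)])
stat, t_hat = icss_statistic(series)
flagged = stat > 1.358
```

With a variance change of this size the statistic should comfortably exceed the threshold, and `t_hat` should fall close to the true change point at 300.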

In cases where the series may contain multiple change points, the above procedure can be iterated. The 
ICSS algorithm is first applied to the full series. If a change is flagged and estimated to be at location $\hat{\tau}_0$, 
the series is split into two segments, $A = \{r_1, r_2, \ldots, r_{\hat{\tau}_0 - 1}\}$ and $B = \{r_{\hat{\tau}_0}, r_{\hat{\tau}_0 + 1}, \ldots, r_n\}$, around this 
point. Then, the ICSS algorithm is recursively applied to both segments A and B separately, in the same 
manner as before. If a change point is flagged in segment A, and estimated to be at location $\hat{\tau}_1 < \hat{\tau}_0$, then 
segment A is further subdivided into two segments around point $\hat{\tau}_1$, and the ICSS algorithm is applied to 
these new segments, and so on. The same procedure is likewise applied to segment B. This produces a 
sequence $\hat{\tau}_0, \hat{\tau}_1, \ldots$ of estimated change points. 

Figure 2: The red lines denote the volatility changes identified by the ICSS and NPCPM algorithms in a 
typical simulated series of Student-t(3) random variables with true change points at times 200 and 400. 
(a) Change points detected by ICSS. (b) Change points detected by NPCPM. 
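
The recursive splitting described in this section can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names and the `min_len` guard (which stops the recursion on very short segments) are hypothetical additions that are not part of the algorithm as described, and a change point is reported as the number of observations preceding the split, which differs by one from the segment convention used in the text.

```python
import numpy as np

def max_d(r):
    """Standardised max |D_t| and the number of observations before the split."""
    n = len(r)
    c = np.cumsum(r ** 2)
    d = np.abs(c / c[-1] - np.arange(1, n + 1) / n)
    k = int(np.argmax(d))
    return np.sqrt(n / 2.0) * d[k], k + 1

def icss_recursive(r, h=1.358, min_len=10, offset=0):
    """Sorted change point estimates, expressed as positions in the full series."""
    r = np.asarray(r, dtype=float)
    if len(r) < 2 * min_len:
        return []
    stat, split = max_d(r)
    # Stop when the threshold is not exceeded or a segment would be too short.
    if stat <= h or split < min_len or len(r) - split < min_len:
        return []
    left = icss_recursive(r[:split], h, min_len, offset)
    right = icss_recursive(r[split:], h, min_len, offset + split)
    return left + [offset + split] + right

# Hypothetical Gaussian example with true change points at 300 and 600.
rng = np.random.default_rng(1)
series = np.concatenate([rng.normal(0, 1, 300),
                         rng.normal(0, 4, 300),
                         rng.normal(0, 1, 300)])
change_points = icss_recursive(series)
```

Because each recursive call applies a fresh test at the same threshold, additional change points may occasionally be flagged even on Gaussian data.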

Deploying the ICSS algorithm requires specifying the threshold $h_n$. In the original paper [T3j, this is 
chosen in order to control the probability of mistakenly concluding that a change has occurred when in fact there 
is no change. Let $\alpha$ be the probability of this occurring. The authors show that if the observations are 
Gaussian, then choosing $h_n = 1.358$ asymptotically gives $\alpha = 0.05$. This is the value which has typically 
been used by other papers employing the ICSS-GARCH algorithm 
[H [T71 [IB]. Note that if the observations are not Gaussian, then the actual value of $\alpha$ obtained for this choice 
of $h_n$ may be radically different from 0.05; this is the crux of the problem with the ICSS algorithm in the 
context of financial data. 

2.1.1. Non-Gaussian Data 

The ICSS algorithm is very easy to implement and does not require many computational resources, which 
is one of the reasons why it has been widely adopted. However, its reliance on the Gaussian distribution when 
specifying the threshold $h_n$ is problematic, since financial data is known to be non-Gaussian, and can exhibit 
heavy tailed behavior. The justification for the Gaussian assumption comes from the central limit theorem; 
if there are no change points, then $C_t$ is asymptotically Gaussian since it is a sum of independent and 
identically distributed random variables. However, asymptotic arguments often fail in practice, where we are concerned 
with finite length return series. The basic problem is that, because financial return series are heavy tailed, 
there will occasionally be large values generated which are interpreted as change points, even though they 
should more correctly be classed as outliers. 

We illustrate this by deploying the ICSS algorithm on a simulated series of Student-t random variables, 
which is a standard distribution used to model heavy tailed behaviour. The series consists of 600 independent 
observations. The first 200 observations have a standard Student-t(3) distribution, which has mean 0 and 
variance 3. The next 200 observations come from a scaled Student-t(3) distribution with mean 0 and 
variance 12. Finally, the last 200 observations are again Student-t(3) with mean 0 and variance 3. The series 
hence consists of 3 regimes, separated by volatility shifts at times 200 and 400. We stress that by design the 
volatility is constant within each regime, therefore any change points flagged away from times 200 and 400 are 
spurious false positives. 
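
A series of this kind can be generated with a few lines of Python. The sketch below is an assumption about how such a simulation might be set up (the function name, argument names, and seed are hypothetical); the middle regime is obtained by multiplying Student-t(3) draws by 2, which raises the variance from 3 to 12.

```python
import numpy as np

def simulate_series(rng, n_per_regime=200, df=3, scale_mid=2.0):
    """Three regimes of Student-t(df) noise; the middle one is scaled up."""
    first = rng.standard_t(df, n_per_regime)
    middle = scale_mid * rng.standard_t(df, n_per_regime)  # variance x 4
    last = rng.standard_t(df, n_per_regime)
    return np.concatenate([first, middle, last])

rng = np.random.default_rng(0)
series = simulate_series(rng)

# A robust spread measure (median absolute value) in the first two regimes.
spread_first = np.median(np.abs(series[:200]))
spread_middle = np.median(np.abs(series[200:400]))
```

The median absolute value is used here instead of the sample variance because, with only three degrees of freedom, the sample variance of 200 observations is itself very noisy.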

We simulated 1000 realisations of such a series, and applied the ICSS algorithm to each. On average, ICSS detected 
5.4 change points in each sequence, which is almost three times the true number. A typical realisation of 
the series is shown in Figure 2a with the change points discovered by ICSS plotted as red lines, and it can 
be seen that many spurious change points are generated. The problem is that the ICSS algorithm is based 
on the squared magnitudes of the returns, $r_t^2$. When the observations are heavy-tailed, extreme 
values can be produced even though nothing in the series has changed. The ICSS algorithm incorrectly 
interprets these extreme values as being jumps in volatility. This suggests that the ICSS algorithm will not 
work on daily financial return series which exhibit similar heavy tail behaviour, and our analysis in Section 
4 will confirm this. 

3. NPCPM: A Nonparametric Alternative to ICSS 

The limitation of the ICSS algorithm is hence the assumption that the returns are Gaussian. We therefore 
propose replacing the ICSS stage of the ICSS-GARCH algorithm with a new technique which makes no such 
distributional assumptions, based on the sequential change detection work in [21]. We call this approach the 
Nonparametric Change Point Model (NPCPM). Recall from the discussion of ICSS that it was configured to 
have a 0.05 probability of incurring a false positive, given that there are no changes in the return series, and 
that the Gaussian assumption holds. We wish to retain this fixed 0.05 false positive probability, regardless 
of the true return distribution. This can be done by adopting the idea of rank tests from the field of 
nonparametric statistics. These tests are easy to understand and implement, yet are very powerful and able 
to maintain a fixed rate of false positives regardless of the true distribution of the return series. We first 
review a standard non-parametric test for comparing two samples of observations and testing whether they 
have the same variance, when their distribution is unknown. Then, we show how this test can be extended 
to detect changes in volatility. 

3.1. Two Sample Testing 

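Suppose that we have two samples of observations $A = \{r_{1,1}, r_{1,2}, \ldots, r_{1,m}\}$ and $B = \{r_{2,1}, r_{2,2}, \ldots, r_{2,n}\}$ 
with an unknown heavy-tailed distribution, and we wish to test whether they have equal variance. One 
commonly used method for this is the Mood test [22]. This consists of replacing each observation with its 
rank, which is defined as the number of observations in the combined sample which it is greater than or equal to, 
so that the smallest observation has rank 1. More formally, the rank of each observation $r_{i,j}$ is: 

$$\mathrm{rank}(r_{i,j}) = \sum_{k=1}^{m} I(r_{i,j} \ge r_{1,k}) + \sum_{l=1}^{n} I(r_{i,j} \ge r_{2,l}),$$

where $I(\cdot)$ denotes the indicator function. 
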
So for example, if the first sample contains the observations (1.02,1.32,2.17) and the second sample 
contains (0.87, 1.21, 1.89) then the observation 0.87 has rank 1, the observation 1.02 has rank 2, and so on. 
The key point in the theory of rank tests is that if both samples have the same distribution, then each 
observation is equally likely to have any of the m + n possible ranks. This is true regardless of what the 
true distribution is, and no matter how heavy tailed it is. Therefore, any test statistic which depends only 
on the ranks of the observations will not depend on their distribution. 
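
The ranking step can be checked on the small example given above. The following Python sketch (the function name `combined_ranks` is a hypothetical choice, and ties are assumed not to occur) pools the two samples, ranks the pooled values from 1 upwards, and then splits the ranks back into the two samples.

```python
import numpy as np

def combined_ranks(a, b):
    """Ranks (1 = smallest) of each observation within the pooled sample."""
    pooled = np.concatenate([a, b])
    order = np.argsort(pooled, kind="stable")
    ranks = np.empty(len(pooled), dtype=int)
    ranks[order] = np.arange(1, len(pooled) + 1)
    return ranks[:len(a)], ranks[len(a):]

ranks_a, ranks_b = combined_ranks([1.02, 1.32, 2.17], [0.87, 1.21, 1.89])
# ranks_a is [2, 4, 6] and ranks_b is [1, 3, 5]
```

Only the ordering of the observations enters the result, which is why any statistic built from these ranks is unaffected by how heavy tailed the underlying distribution is.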

Table 1: Values of the $h_n$ threshold which give a 0.05 probability of incurring a spurious false positive when 
using the maximized Mood statistic, for various lengths $n$ of the financial series. 

The Mood test for equal variance measures the extent to which the rank of each observation deviates 
from the median rank. If both samples have an identical distribution, then the median rank is simply 
$(n + m + 1)/2$. If the observations all have the same distribution, then we would expect the ranks to be 
roughly equally split between the two samples. However, if the variance of the samples differs, then the one 
with higher variance will typically have significantly more extreme observations than the other. This leads to a 
test statistic based on summing the squared rank deviations from either of the samples, and comparing it 
to a threshold: 

$$M'_{m,n} = \sum_{i=1}^{m} \left(\mathrm{rank}(r_{1,i}) - (n + m + 1)/2\right)^2.$$

The expected value and standard deviation of this $M'_{m,n}$ statistic depend on the sample sizes $m$ and $n$. 
To make it easier to compare values evaluated on different sample sizes, we standardise it by subtracting its 
mean and dividing by its standard deviation. From [22] this can be shown to be: 

$$M_{m,n} = \frac{|M'_{m,n} - \mu_{M'}|}{\sigma_{M'}}, \quad \mu_{M'} = \frac{m(N^2 - 1)}{12}, \quad \sigma^2_{M'} = \frac{mn(N + 1)(N^2 - 4)}{180}, \quad N = m + n.$$

Finally, if $M_{m,n} > h_{m,n}$ for some appropriately chosen threshold, then we conclude that the two samples 
have unequal variance. The value of $h_{m,n}$ can again be chosen to (e.g.) give a 0.05 probability of falsely 
concluding that the samples have unequal variance when they are in fact equal. Unlike the ICSS approach, 
this threshold can be chosen in a way that allows this probability to hold regardless of how the returns are 
distributed. 
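
A minimal Python sketch of the standardised Mood statistic is given below. It assumes the standard form of the Mood test (squared rank deviations of the first sample, centred by $m(N^2-1)/12$ and scaled by the square root of $mn(N+1)(N^2-4)/180$) and assumes there are no ties; the function name is hypothetical.

```python
import numpy as np

def mood_statistic(a, b):
    """Standardised Mood statistic comparing the spread of samples a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    m, n = len(a), len(b)
    N = m + n
    pooled = np.concatenate([a, b])
    ranks = np.empty(N)
    ranks[np.argsort(pooled)] = np.arange(1, N + 1)
    # Sum of squared deviations of the first sample's ranks from the centre.
    raw = np.sum((ranks[:m] - (N + 1) / 2.0) ** 2)
    mean = m * (N ** 2 - 1) / 12.0
    sd = np.sqrt(m * n * (N + 1) * (N ** 2 - 4) / 180.0)
    return abs(raw - mean) / sd

# The two interleaved samples from the ranking example have equal spread.
same_spread = mood_statistic([1.02, 1.32, 2.17], [0.87, 1.21, 1.89])
# Hypothetical samples where the first is clearly more dispersed.
diff_spread = mood_statistic([-10.0, -9.0, 9.0, 10.0], [-1.0, 0.0, 1.0, 2.0])
```

For the first pair the ranks of the first sample are 2, 4 and 6, so the squared deviations from 3.5 sum to 8.75, which equals the null mean, and the statistic is 0. For the second pair the raw sum is 37 against a null mean of 21 and a variance of 48, giving a statistic of about 2.31.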

3.2. Change Detection 

Given the return series $r_1, \ldots, r_t$, we wish to test whether there is a change in volatility. Assuming for 
now that there is at most a single change point, we can think of this as being a compound problem; we first 
test whether there is a change point immediately after the second observation, then test if there is a change 
point after the third observation, and so on. More formally, we wish to decide between the hypothesis that 
there is no change point in the series, and the hypothesis that there is a change point at observation $k$ for 
some unknown value of $1 < k < t$. 

For each possible value of $k$, split the observations into two samples $\{r_1, r_2, \ldots, r_k\}$ and $\{r_{k+1}, r_{k+2}, \ldots, r_t\}$. 
Then, the Mood test can be applied to these two samples in order to test whether they have equal 
variance as before. Let $M_{k,t}$ be the computed value. By repeating this procedure over all values of $k$, the 
following maximized test statistic can be defined: 

$$M_t = \max_k M_{k,t}.$$

The test then consists of comparing this maximized statistic to a threshold $h_t$. As before, if $M_t > h_t$ 
for an appropriate threshold, we conclude that a change has occurred, with the best estimate of the change 
point then being $\hat{\tau} = \arg\max_k M_{k,t}$. Note the similarity between this and the ICSS statistic in equation (1). 
In both cases, we are essentially performing a test at each individual point in the sequence, and picking out 
the value which maximises it. 
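
The maximized statistic can be computed efficiently because the pooled sample is the whole series for every split point, so the ranks only need to be computed once. The Python sketch below illustrates this idea under some assumptions: the function name and the `min_seg` argument (the smallest sample size allowed on either side of the split) are hypothetical, ties are assumed absent, and the Student-t test series is made up for the example.

```python
import numpy as np

def max_mood(r, min_seg=2):
    """Return (max_k M_k, arg max k), where k is the size of the first sample."""
    r = np.asarray(r, dtype=float)
    N = len(r)
    ranks = np.empty(N)
    ranks[np.argsort(r)] = np.arange(1, N + 1)
    dev2 = (ranks - (N + 1) / 2.0) ** 2
    k = np.arange(min_seg, N - min_seg + 1)       # candidate split points
    raw = np.cumsum(dev2)[k - 1]                  # Mood sum for the first k obs
    mean = k * (N ** 2 - 1) / 12.0
    sd = np.sqrt(k * (N - k) * (N + 1) * (N ** 2 - 4) / 180.0)
    m_k = np.abs(raw - mean) / sd
    best = int(np.argmax(m_k))
    return float(m_k[best]), int(k[best])

# Hypothetical heavy-tailed series whose scale quadruples after t = 200.
rng = np.random.default_rng(2)
series = np.concatenate([rng.standard_t(3, 200), 4.0 * rng.standard_t(3, 200)])
stat, k_hat = max_mood(series)
```

The returned `stat` would then be compared with a threshold appropriate for a series of this length.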

The final step is specifying the value of the threshold, which we write as $h_n$ for a series of length $n$. Similar to 
the ICSS algorithm, we wish to choose this so that the probability of incurring a false positive is equal to 0.05; 
it should therefore be chosen as the 95th percentile of the distribution of the maximized statistic $M_n$ when no 
change has occurred. Unlike the ICSS algorithm, doing this will guarantee a false positive probability of 
0.05 regardless of the return distribution. These values can be easily found using Monte Carlo simulation. 
In Table 1 we list the $h_n$ values which give a false positive probability of 0.05, for various lengths of the 
financial series. 
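
One way such a Monte Carlo calculation could be organised is sketched below. Since the statistic depends only on the ranks, the null distribution can be simulated by drawing random permutations of the ranks $1, \ldots, n$. The function names, the number of simulations, and the `min_seg` argument are assumptions made for this sketch, and the resulting values are not claimed to reproduce Table 1.

```python
import numpy as np

def max_mood_from_ranks(ranks, min_seg=2):
    """Maximized standardised Mood statistic computed directly from ranks."""
    N = len(ranks)
    dev2 = (ranks - (N + 1) / 2.0) ** 2
    k = np.arange(min_seg, N - min_seg + 1)
    raw = np.cumsum(dev2)[k - 1]
    mean = k * (N ** 2 - 1) / 12.0
    sd = np.sqrt(k * (N - k) * (N + 1) * (N ** 2 - 4) / 180.0)
    return float(np.max(np.abs(raw - mean) / sd))

def simulate_threshold(n, alpha=0.05, n_sims=2000, seed=0):
    """Estimate h_n as the (1 - alpha) quantile of the null distribution."""
    rng = np.random.default_rng(seed)
    stats = [max_mood_from_ranks(rng.permutation(n) + 1.0)
             for _ in range(n_sims)]
    return float(np.quantile(stats, 1.0 - alpha))

h_50 = simulate_threshold(50, n_sims=500)
```

The accuracy of the estimated threshold improves with `n_sims`; the small value used here only keeps the example fast.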
In cases where the series may contain multiple change points, we use the same recursive approach as in 
the ICSS algorithm. We first run our method on the whole series, and compare $M_n$ to the threshold $h_n$. If 
it exceeds it, let $\hat{\tau}_0 = \arg\max_k M_{k,n}$. The observations are then split into two samples around $\hat{\tau}_0$, and the 
change detection algorithm is recursively applied to each sample until the threshold is no longer exceeded. 

To illustrate the advantage of our approach over ICSS when working with non-Gaussian data, we applied 
it to the same heavy-tailed Student-t data discussed in the previous section. Based on 10000 simulations, the 
NPCPM algorithm on average detects 2.1 change points per sequence, compared to both the true number of 
2, and the average of 5.4 found by the ICSS algorithm. This highlights that the NPCPM approach is much 
more accurate when working with non-Gaussian data. Figure 2b shows the volatility change points which 
are identified in a typical realisation of the Student-t(3) series. Unlike the ICSS algorithm, our approach 
does not typically generate spurious false positives even though the observations are heavy-tailed. This 
shows that it is better able to cope with heavy tailed observations, and should be better suited to financial 
data. 

3.3. Stage 2: GARCH Modelling 

After the jump points have been found using either ICSS or NPCPM, the next step is to model the 
volatility drift in the segments between each pair of change points. This is done using the GARCH(p,q) 
model [2], where the conditional variance of the returns obeys an autoregressive moving average process, 
with p and q denoting the time lags. In practice, by far the most common version of this model is the 
GARCH(1,1), which we also use. A financial time series is said to be GARCH(1,1) if its conditional variance $h_t$ has 
the following time-varying form: 

$$r_t = \sqrt{h_t}\,\epsilon_t,$$
$$h_t = \omega + \alpha h_{t-1} + \beta r_{t-1}^2,$$

where $\epsilon_t$ is a sequence of independent and identically distributed random variables. In other words, the 
conditional variance at time $t$ is a function of the long term volatility ($\omega$), the variance at the previous time point 
($h_{t-1}$), and the squared previous return ($r_{t-1}^2$). This reliance on previous values leads naturally to the 
volatility clustering effect as seen in Figure 1. The distribution of $\epsilon_t$ is often taken to be Gaussian $N(0, 1)$, 
but we will also consider the case where the $\epsilon_t$ variables have a Student-t distribution with $\nu$ degrees of 
freedom as in [3], in order to model heavy tail behaviour. 
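
To make the recursion concrete, the following Python sketch simulates a GARCH(1,1) series with Gaussian errors. It follows the parameterisation used here, in which $\alpha$ multiplies the previous variance and $\beta$ multiplies the previous squared return, and it treats $h_t$ as the conditional variance so that $r_t = \sqrt{h_t}\,\epsilon_t$. The function name, parameter values, and the choice to start the recursion at the unconditional variance $\omega/(1-\alpha-\beta)$ are assumptions made for illustration.

```python
import numpy as np

def simulate_garch11(n, omega, alpha, beta, seed=0):
    """Simulate returns r and conditional variances h from a GARCH(1,1)."""
    rng = np.random.default_rng(seed)
    h = np.empty(n)
    r = np.empty(n)
    # Start at the unconditional variance (requires alpha + beta < 1).
    h[0] = omega / (1.0 - alpha - beta)
    r[0] = np.sqrt(h[0]) * rng.standard_normal()
    for t in range(1, n):
        h[t] = omega + alpha * h[t - 1] + beta * r[t - 1] ** 2
        r[t] = np.sqrt(h[t]) * rng.standard_normal()
    return r, h

r, h = simulate_garch11(1000, omega=0.1, alpha=0.8, beta=0.1)
```

A large return at time $t-1$ raises $h_t$, and the $\alpha h_{t-1}$ term then makes that increase decay only gradually, which is the mechanism behind volatility clustering.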

One limiting feature of the GARCH model is that the volatility is mean reverting and fluctuates around 
a fixed value. As discussed in the Introduction, it is often more realistic to use a regime-switching/change 
point formulation where the parameters of the GARCH model, and hence the long-run volatility, can take 
different values in each segment. In this case, the segment boundaries are the change points found by the 
ICSS or NPCPM algorithms. We consider two different GARCH models; the first is the one used in [Tl I2TI [IB] 
where only the $\omega$ parameter undergoes change, i.e.: 

$$h_t = \omega_t + \alpha h_{t-1} + \beta r_{t-1}^2,$$

where $\omega_t$ is equal to some constant $k_0$ until the first change point, before switching to $k_1$ until the next 
change point, and so on. In the second regime-switching model all three parameters are allowed to vary 
between regimes, i.e.: 

$$h_t = \omega_t + \alpha_t h_{t-1} + \beta_t r_{t-1}^2.$$

This gives a more flexible model, at the risk of overparameterization. We will refer to the model where 
only $\omega$ changes as the ω-GARCH model, and the one where all parameters change as αβω-GARCH. Note 
that when using the models with Student-t errors with $\nu$ degrees of freedom, we treat $\nu$ as a free parameter 
which is estimated along with the GARCH coefficients. 
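
The variance recursion of the model in which only $\omega$ changes can be sketched as follows. The helper names, the 0-based indexing of change points, and the numerical values are all hypothetical; the sketch only shows how a piecewise constant $\omega_t$ enters the recursion once the change points are given, and does not address parameter estimation.

```python
import numpy as np

def omega_path(n, change_points, levels):
    """Piecewise constant omega_t: levels[j] applies on the j-th segment."""
    omega = np.empty(n)
    bounds = [0] + list(change_points) + [n]
    for j, level in enumerate(levels):
        omega[bounds[j]:bounds[j + 1]] = level
    return omega

def omega_garch_variance(r, omega, alpha, beta, h0):
    """Conditional variances h_t = omega_t + alpha*h_{t-1} + beta*r_{t-1}^2."""
    r = np.asarray(r, dtype=float)
    h = np.empty(len(r))
    h[0] = h0
    for t in range(1, len(r)):
        h[t] = omega[t] + alpha * h[t - 1] + beta * r[t - 1] ** 2
    return h

# Toy example: three regimes with levels k_0, k_1, k_2 and all-zero returns.
omega = omega_path(6, change_points=[2, 4], levels=[0.1, 0.5, 0.2])
h = omega_garch_variance(np.zeros(6), omega, alpha=0.5, beta=0.1, h0=1.0)
# h is [1.0, 0.6, 0.8, 0.9, 0.65, 0.525]
```

The version in which all parameters change would follow the same pattern, with piecewise constant arrays for $\alpha_t$ and $\beta_t$ in place of the scalar `alpha` and `beta`.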
(c) Nikkei (d) VIX 

Figure 3: Daily closing prices for the stock indexes we are investigating 

4. Empirical Results 

Our change point study uses four world stock indexes, which are 1) the Dow Jones Industrial Average, 
which consists of 30 large companies in the United States, 2) the Deutscher Aktien Index (DAX), which 
consists of 30 large German companies, 3) the Nikkei 225, which consists of 225 Japanese companies, and 
4) the VIX volatility index, which measures the implied volatility of the companies from the S&P 500. We 
obtained daily closing prices for each series between the 1st of January 1991 and the 31st of October 2011. 
Figure 3 shows a plot of each of these four series. 

For each series, we analyze the daily log-price differences, defined as $r_t = \log S_t - \log S_{t-1}$. 
Table 2 displays some summary statistics for each sequence of differences. It can be seen that all have a 
mean of near 0, as should be expected. All series exhibit kurtosis far in excess of what would be expected if 
they followed a Gaussian distribution (recall that the Gaussian distribution has an excess kurtosis of 0). To test this 
further, we show the p-values of the standard Shapiro-Wilk test [3T] for Gaussianity. The small p-values 
show that the Gaussian hypothesis should be rejected for all series. Finally, we give the p-values associated 
with the Ljung-Box test for autocorrelation, in both the original series of differences, and their squared 
values. If the volatility of each series was constant then we would expect there to be no autocorrelation in 
the squared differences; the low p-values obtained for all series show that this hypothesis should be rejected, 
and that the volatility is not constant. 

Table 2: Summary statistics for the series of daily differences for the four stock indexes. The bottom half 
shows the p-values obtained for several statistical tests, where the superscript 2 symbol denotes that the test has been applied 
to the squared differences. 

Figure 4: Change points discovered in the daily differences for each stock index using the ICSS algorithm. 
(c) Nikkei (31 change points) (d) VIX (30 change points) 

We next investigate the change points which are found by the ICSS and NPCPM algorithms in these 
series. After this, we will fit the full GARCH model, and compare these two methods more formally. 

4.1. Change Point Analysis 

We begin by investigating the change points which are discovered by both the ICSS and NPCPM 
algorithms. We configured both algorithms to have a significance level of $\alpha = 0.05$ as discussed previously. In 
Figure 4 we show the change points which were detected by the ICSS algorithm. It can be seen that there 
are a very large number of change points detected, with 45 for the DAX index, and 31, 30, and 22 for the 
others respectively. Most of these do not seem to correspond to genuine long term changes in the volatility; 
as we would expect from our discussion in Section 2.1.1, many seem to be false positives flagged in response 
to the extreme values of the daily differences which sometimes occur. 

In Figure 5 we show the results of the NPCPM algorithm applied to the same series. In contrast to 
the ICSS, there are fewer change points detected, suggesting that this is giving a better fit to the data. 
Unlike ICSS, the change points found by NPCPM do not seem to correspond to the outlying observations, 
suggesting robustness. In the following section we will use standard model fitting criteria to give a more 
quantitative determination of which algorithm is more accurately finding change points. 

Having completed our preliminary analysis of the change points, we now fit the change point GARCH 
models. As discussed in Section 3.3, we consider several different types of models. As a benchmark, we fit 
a GARCH model with no change points, using both the Gaussian and Student-t distributions for the 
errors. Next, we fit the ω-GARCH and αβω-GARCH models, where the segment boundaries correspond 
to the change points found by the ICSS and NPCPM algorithms. 

As a final modeling remark, it is possible that the large number of change points found by the ICSS 
and NPCPM algorithms are an artifact of the two-stage process we are using to fit the models. Both of 
these change point detection algorithms assume that the observations are independent, however since we 
are applying these algorithms before the GARCH model is fit, it is possible for the autocorrelation in the 
volatility to cause an unusually high number of false positives. We therefore also considered a three-stage 
model fitting procedure of the following form: first, a GARCH(1,1) model is fit to the return series and the 
conditional variance $\sigma_t^2$ is estimated on each day. This is then used to standardize the observations via the 
transformation $y_t = r_t / \sigma_t$. If the GARCH model correctly fits the data and does not contain any change 
points then these transformed variables should be independent with variance 1. The ICSS and NPCPM 
algorithms are then applied to these transformed variables to find any change points. Finally, separate 
GARCH models are fit within each of the discovered regimes. When using this procedure, the number of 
change points found drops substantially with the ICSS procedure finding 3, 7, and 8 change points in the 
four indexes respectively, with the NPCPM finding 1,2,0 and 1, both of which are substantial reductions 
compared to the number found when running the algorithm on the raw sequences. In the following discussion 
we will refer to the models fit in this manner as GICSS and GNPCPM respectively, to denote the fact that 
the algorithms are applied to the residuals from an initial GARCH fit rather than to the raw data. 

4.2. GARCH Model Fitting 

In order to compare which models best describe the data, we use the standard Akaike Information 
Criterion (AIC) and Bayesian Information Criterion (BIC) model comparison metrics [UGH]. Both of these 
measure how well a model fits the observed data, based on the likelihood of the data under the model, with 
a penalization for the number of parameters in the model. This penalization is necessary in order to prevent 
overfitting, and balance out the increase in the likelihood that an overparameterized model will generally 
have. The AIC is defined as: 

$$\mathrm{AIC} = 2k - 2\log(L),$$

where $L$ is the likelihood of the model under the MLE parameter estimate, and $k$ is the number of 
parameters in the model. Similarly, the BIC is defined as: 

$$\mathrm{BIC} = k\log(n) - 2\log(L),$$

where $L$ and $k$ are as before, and $n$ is the number of observations. With both measures a lower value 
indicates a better fit. The practical difference between the two criteria is that the BIC penalizes model 
parameters to a greater degree than the AIC. There is some controversy over which of the two criteria is 
more appropriate, see [4] for a review. We choose to report both, and the results are shown in Table 3. We 
can draw several conclusions: 





                                      Dow Jones                  DAX                   Nikkei                    VIX
Model                                L     AIC     BIC       L     AIC     BIC       L     AIC     BIC       L     AIC     BIC

ω-GARCH, ICSS, Gaussian          16908  -33761  -33584   15561  -31024  -30702    7480  -14895  -14685   13691  -27311  -27082
ω-GARCH, ICSS, Student-t         16336  -32617  -32433   15057  -30014  -29685    7002  -13938  -13721   13845  -27618  -27382
ω-GARCH, NPCPM, Gaussian         16862  -33688  -33570   15245  -30435  -30251    7573  -15113  -15008   13756  -27463  -27306
ω-GARCH, NPCPM, Student-t        16362  -32685  -32560   14691  -29325  -29134    7090  -14146  -14034   13755  -27461  -27297

αβω-GARCH, ICSS, Gaussian        17229  -34277  -33679   15958  -31549  -30346   14855  -29440  -28556    7761  -15276  -14468
αβω-GARCH, ICSS, Student-t       17333  -34483  -33885   15992  -31619  -30416   14903  -29536  -28653    7885  -15525  -14717
αβω-GARCH, NPCPM, Gaussian       17178  -34255  -33920   15836  -31499  -30927   14830  -29485  -28916    7630  -15173  -14891
αβω-GARCH, NPCPM, Student-t      17307  -34512  -34177   15944  -31714  -31143   14878  -29582  -29012    7823  -15560  -15278

αβω-GARCH, GICSS, Gaussian       17162  -34294  -34195   15800  -31537  -31334   14693  -29380  -29360    7654  -15239  -15009
αβω-GARCH, GICSS, Student-t      17260  -34490  -34391   15872  -31681  -31477   14768  -29529  -29509    7809  -15548  -15319
αβω-GARCH, GNPCPM, Gaussian      17140  -34267  -34221   15713  -31405  -31332   14693  -29380  -29360    7564  -15113  -15067
αβω-GARCH, GNPCPM, Student-t     17246  -34478  -34432   15838  -31655  -31582   14768  -29529  -29509    7716  -15419  -15373

Table 3: The log-likelihood (L), AIC, and BIC associated with each model on the four stock indexes. The bold
text shows the model which gives the best fit as measured by each criterion.
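
To make the two definitions concrete, the following minimal sketch computes both criteria from a maximized log-likelihood. The function names and all numerical values are hypothetical and are not taken from Table 3. The example is chosen so that the two criteria disagree, which can happen when a model with more parameters gains only a small amount of likelihood.

```python
import math


def aic(log_likelihood, k):
    """Akaike Information Criterion: 2k - 2 log(L)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: k log(n) - 2 log(L)."""
    return k * math.log(n) - 2 * log_likelihood


n = 5000  # hypothetical number of daily observations

# Hypothetical models: "small" has 3 parameters, "large" has 8 parameters
# and a slightly higher log-likelihood.
small = {"log_likelihood": 1000.0, "k": 3}
large = {"log_likelihood": 1006.0, "k": 8}

for name, m in (("small", small), ("large", large)):
    print(name, aic(m["log_likelihood"], m["k"]), bic(m["log_likelihood"], m["k"], n))
```

With these inputs the AIC is lower for the larger model while the BIC is lower for the smaller one, because the BIC charges log(n) per parameter instead of 2.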
Figure 5: Change points discovered in the daily differences for each stock index using the NPCPM algorithm.
Panels: (a) Dow Jones (12 change points), (b) DAX (21 change points), (c) Nikkei (21 change points),
(d) VIX (10 change points). The horizontal axis of each panel is the date.

• The models using the Student-t error distribution consistently outperform the Gaussian models. This
suggests that even with the time-varying volatility allowed by the GARCH process, the Gaussian
distribution still cannot adequately capture the heavy-tailed nature of these series. Similar results
have been noted by [15].

• The best fits in terms of likelihood are given by the various αβω-GARCH models. This shows the
advantage of incorporating structural breaks into the GARCH framework.

• The full ωαβ-GARCH models, which allow all parameters to vary, generally outperform the more
parsimonious ω-GARCH models, even when factoring in the likelihood penalty imposed by the AIC and BIC.

• According to the AIC performance measure, the ωαβ-GARCH model using NPCPM change points
is the best fitting model for every stock index. Although the ICSS methods give comparable values
for the likelihood, the exaggerated number of change points they produce means that they give a poorer fit
overall.

• When the BIC performance measure is used instead, the three stage model, where the NPCPM algorithm
is applied to the residuals of an initial GARCH fit, gives the best BIC for three of the four indexes.
The exception is the Nikkei index, for which no change points were found by either the ICSS or NPCPM
methods when applying these to the GARCH residuals, in which case their BIC results are tied.

In summary, the fact that the NPCPM algorithm does not assume Gaussianity means that it is more
robust to outliers than the ICSS, and this results in more parsimonious change detection when using daily
returns. This is reflected in the performance criteria used to assess model fit, which show that it gives
improved overall results. We would hence generally recommend the ωαβ-GARCH model using the
Student-t error distribution in conjunction with the NPCPM algorithm when modeling volatility.

Regime                                       Volatility

2nd January 1991 - 16th May 1991             0.011
17th May 1991 - 30th December 1996           0.007
31st December 1996 - 16th June 2002          0.012
17th June 2002 - 23rd September 2002         0.020
24th September 2002 - 17th October 2002      0.028
18th October 2002 - 25th July 2003           0.013
26th July 2003 - 16th August 2006            0.007
17th August 2006 - 18th July 2007            0.006
19th July 2007 - 14th September 2008         0.013
15th September 2008 - 9th December 2008      0.042
10th December 2008 - 1st June 2009           0.020
2nd June 2009 - 7th August 2011              0.010
8th August 2011 - 16th November 2011         0.019

Table 4: Unconditional volatility in each segment found by the NPCPM algorithm



We note that we could also have compared these different methods for volatility estimation by treating
them as predictive models and comparing their out-of-sample forecasting errors according to some standard
criterion such as the mean squared error (MSE). However, the question of whether change point models
are appropriate for short-term volatility forecasting is still controversial, and there is some evidence [5]
that standard GARCH models may perform better for this purpose depending on the precise performance
measure which is used. Because our main concern is volatility estimation rather than forecasting, we prefer
to avoid this issue and use performance measures which relate only to (penalized) model fit.

4.3. Further Analysis

After detecting the change points in the above section, it is potentially interesting to investigate whether
they correspond to events in the real world. In this section we examine the Dow Jones index in more detail.
Using the NPCPM GARCH algorithm, 12 change points were detected. For each of the dates at which
the changes occurred, we searched through news headlines from the week immediately before and after to
find whether any major events occurred which may be related. For 5 of the change points, we managed
to find significant economic events which occurred within several days and may have been the cause of the
volatility shifts:



• 26th July 2003: two days earlier on the 24th July, the S&P credit rating agency cut the rating of
California bonds from A to BBB.

• 19th July 2007: the following week, the Dow Jones index experienced a substantial 2.3% drop over
concerns about the housing and credit markets. The volatility increase may have anticipated this.

• 15th September 2008: on this date Lehman Brothers filed for bankruptcy, the event which signaled
the start of the recent financial crisis.

• 2nd June 2009: on the previous day, General Motors filed for bankruptcy.

• 8th August 2011: three days earlier, S&P downgraded the credit rating of the United States.
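
The matching of change points to nearby events was done by manually searching headlines. As a rough illustration only, the same windowed comparison can be written as a short Python sketch. The function name and the one-week window parameter are assumptions made for this example. The event list contains only the four events for which an exact date is stated above, so the July 2007 drop, which is described only as "the following week", is not included.

```python
from datetime import date


def events_near(change_point, events, window_days=7):
    """Labels of the events dated within window_days of a change point."""
    return [label for day, label in events
            if abs((day - change_point).days) <= window_days]


# The 12 Dow Jones change points reported for the NPCPM algorithm.
change_points = [
    date(1991, 5, 17), date(1996, 12, 31), date(2002, 6, 17),
    date(2002, 9, 24), date(2002, 10, 18), date(2003, 7, 26),
    date(2006, 8, 17), date(2007, 7, 19), date(2008, 9, 15),
    date(2008, 12, 10), date(2009, 6, 2), date(2011, 8, 8),
]

# Events with an exact date given in the text.
events = [
    (date(2003, 7, 24), "S&P cuts rating of California bonds"),
    (date(2008, 9, 15), "Lehman Brothers files for bankruptcy"),
    (date(2009, 6, 1), "General Motors files for bankruptcy"),
    (date(2011, 8, 5), "S&P downgrades United States credit rating"),
]

matched = {cp: events_near(cp, events) for cp in change_points}
for cp, labels in matched.items():
    if labels:
        print(cp.isoformat(), labels)
```

A fixed window of this kind is a simplification: it cannot judge whether an event is economically significant, which is why a manual reading of the headlines is still needed.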



For the remaining 7 change points we did not find any specific events. For reference, these were 17th
May 1991, 31st December 1996, 17th June 2002, 24th September 2002, 18th October 2002, 17th August
2006, and 10th December 2008. This seems slightly puzzling since the first two change points in 1991 and
1996 correspond to very clear structural change points which can be seen in Figure 5a, with the second one
marking a pronounced switch from a period of low volatility to high volatility. Since there are no specific
associated news events, it is possible that these changes occurred in response to longer term trends in the
markets or economic system rather than being responses to specific events. For example, the bursting of the
internet bubble caused a prolonged stock market downturn during 2002, during which the Dow Jones lost
almost 17% of its value, with most of this occurring between May and October. It seems probable that the
change points found by NPCPM in June and September are caused by this, even though there are no high
profile news events around these dates specifically.

To investigate further, Table 4 shows the unconditional volatility in each of the segments. It can be
seen that the most volatile period was, unsurprisingly, the three month period immediately following the
bankruptcy of Lehman Brothers in late 2008. After this, volatility decreased but was still high compared
to the historical average. The other sustained period of high volatility occurred during 2002 and lasted from
June to October. Since this corresponds quite closely to the stock market downturn, it seems reasonable to
assume that the downturn was an underlying factor which may have caused the discovered change points
around the start and end of this period.

5. Conclusions 

Many financial applications require an accurate estimate of the historical volatility of specified financial
instruments. For example, certain types of derivatives are priced using the realized volatility. One case is
the popular Merton model, which is used to price Credit Default Swaps and is usually estimated by using the
volatility of the stock price as an input variable [13]. Volatility calculations also feature extensively in
risk management, with GARCH models finding regular use within traditional Value-at-Risk (VaR) analysis
[7]. Similarly, accurate volatility estimation is the first step in computing the correlation between financial
instruments, which is a central task in portfolio optimization [BJ. 

The ICSS-GARCH algorithm has been widely used to model the time-varying volatility commonly found
in financial returns. In this methodology, the ICSS algorithm is first used to segment the series based on
discovered change points, before a GARCH model is fit to each segment. However, ICSS is very sensitive
to heavy-tailed data, and can flag spurious change points when used in this setting. This is unfortunate
since heavy-tailed behaviour is typical in financial data, and it has limited the use of the algorithm to
the study of weekly returns, where large daily price movements are smoothed out. In order to work with
daily data, we have introduced an alternative algorithm in which we replace ICSS with a test utilizing ideas
from nonparametric statistics. Our experimental analysis shows that this generally gives a better fit to daily
data, as measured by several standard model selection techniques.